Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

Authors

Konstantinos I. Chatzilygeroudis

Jean-Baptiste Mouret

Abstract

The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the “pendubot” swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time.
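The alternation described in the abstract (collect data on the robot, learn a model built on a parameterized prior, optimize the policy on that model) can be illustrated with a toy sketch. This is not the paper's algorithm: the 1D system, the names `true_step`, `prior_step`, `fit_model`, and `policy_search`, the grid search over the prior's gain, the constant residual (standing in for a probabilistic model), and the random-search policy optimizer are all assumptions made for illustration.

```python
import random

def true_step(x, u):
    # Hypothetical "real robot": a gain of 0.5 plus a bias the prior cannot express.
    return x + 0.5 * u + 0.05

def prior_step(x, u, gain):
    # Parameterized prior: a stand-in for a simulator with one tunable parameter.
    return x + gain * u

def fit_model(data, candidate_gains):
    # For each candidate prior parameter, fit a constant residual correction,
    # then keep the parameter whose corrected model has the lowest squared error.
    def residual(gain):
        return sum(xn - prior_step(x, u, gain) for x, u, xn in data) / len(data)

    def sq_error(gain):
        b = residual(gain)
        return sum((xn - prior_step(x, u, gain) - b) ** 2 for x, u, xn in data)

    gain = min(candidate_gains, key=sq_error)
    bias = residual(gain)
    return (lambda x, u: prior_step(x, u, gain) + bias), gain

def rollout(step, k, x0=1.0, horizon=10):
    # Linear policy u = -k * x; the return is the negative sum of squared states.
    x, ret, transitions = x0, 0.0, []
    for _ in range(horizon):
        u = -k * x
        x_next = step(x, u)
        transitions.append((x, u, x_next))
        ret -= x_next ** 2
        x = x_next
    return ret, transitions

def optimize_policy(model, rng, n_samples=200):
    # Black-box policy search on the learned model (plain random search here).
    candidates = [rng.uniform(0.0, 4.0) for _ in range(n_samples)]
    return max(candidates, key=lambda k: rollout(model, k)[0])

def policy_search(n_episodes=3, seed=0):
    rng = random.Random(seed)
    candidate_gains = [i / 10 for i in range(1, 21)]
    k, data, returns, gain = 0.5, [], [], None
    for _ in range(n_episodes):
        ret, transitions = rollout(true_step, k)   # one episode on the "robot"
        returns.append(ret)
        data.extend(transitions)
        model, gain = fit_model(data, candidate_gains)
        k = optimize_policy(model, rng)            # improve the policy on the model
    return returns, gain, k

returns, gain, k = policy_search()
print(returns, gain, k)
```

In this sketch the prior parameter and the residual are fitted together, so the prior can be wrong in both its parameter and its structure and the corrected model can still match the observed transitions.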

Similar resources

Bayesian Optimization for Contextual Policy Search

Contextual policy search allows adapting robotic movement primitives to different situations. For instance, a locomotion primitive might be adapted to different terrain inclinations or desired walking speeds. Such an adaptation is often achievable by modifying a relatively small number of hyperparameters; however, learning when performed on an actual robotic system is typically restricted to a ...

Model Identification via Physics Engines for Improved Policy Search

This paper presents a practical approach for identifying unknown mechanical parameters, such as mass and friction models of manipulated rigid objects or actuated robotic links, in a succinct manner that aims to improve the performance of policy search algorithms. Key features of this approach are the use of off-the-shelf physics engines and the adaptation of a black-box Bayesian optimization fr...

Black Box Optimization of PID controllers for Micro Aerial Vehicles

When operating flying robots we repeatedly face the necessity of tuning controllers for particular electronic and physical configurations. In the spirit of embodied robotic learning we want to tune the controllers during the system’s closed-loop operation. We present an experiment applying a black box Bayesian optimization technique to the problem of tuning a hierarchical vertical position PID ...

Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

The democratization of robotics technology and the development of new actuators progressively bring robots closer to humans. The applications that can now be envisaged drastically contrast with the requirements of industrial robots. In standard manufacturing settings, the criteria used to assess performance are usually related to the robot’s accuracy, repeatability, speed or stiffness. Learni...

Distributed Black-Box Software Testing Using Negative Selection

In the software development process, testing is one of the most human-intensive steps. Many researchers try to automate test case generation to reduce the manual labor of this step. Negative selection is a famous algorithm in the field of Artificial Immune Systems (AIS), and many different applications have been developed using its idea. In this paper we have designed a new algorithm based on nega...

Journal title:

CoRR

Volume abs/1709.06917, Issue

Pages -

Publication date 2017
